Empirical Risk Minimization for Probabilistic Grammars: Sample Complexity and Hardness of Learning
Authors
Abstract
Probabilistic grammars are generative statistical models that are useful for compositional and sequential structures, and they are used ubiquitously in computational linguistics. We present a framework, reminiscent of structural risk minimization, for empirical risk minimization of probabilistic grammars using the log-loss. We derive sample complexity bounds in this framework that apply to both the supervised and the unsupervised settings. By making assumptions about the underlying distribution that are appropriate for natural language scenarios, we derive distribution-dependent sample complexity bounds for probabilistic grammars. We also give simple algorithms for carrying out empirical risk minimization in this framework in both the supervised and unsupervised settings. In the unsupervised case, we show that the problem of minimizing empirical risk is NP-hard; we therefore suggest an approximate algorithm, similar to expectation-maximization, to minimize the empirical risk.
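To make the supervised case concrete, here is a minimal sketch of ERM under the log-loss for a probabilistic context-free grammar: with fully observed derivations, minimizing the empirical risk reduces to maximum likelihood, i.e., normalized rule counts. All names and the data format are illustrative assumptions, not the paper's notation; in the unsupervised setting the derivations are latent, which is where the NP-hardness result and the EM-style approximation come in.

```python
# Hypothetical sketch: supervised ERM for a PCFG under the log-loss.
# With fully observed derivations, the empirical-risk minimizer is the
# maximum-likelihood estimate: rule counts normalized per left-hand side.
from collections import Counter

def erm_pcfg(derivations):
    """Estimate rule probabilities from observed derivations.

    derivations: iterable of derivations, each a list of rules,
                 where a rule is an (lhs, rhs) pair, e.g. ("S", ("NP", "VP")).
    Returns: dict mapping (lhs, rhs) -> probability, normalized per lhs.
    """
    rule_counts = Counter()
    lhs_counts = Counter()
    for derivation in derivations:
        for lhs, rhs in derivation:
            rule_counts[(lhs, rhs)] += 1
            lhs_counts[lhs] += 1
    return {rule: c / lhs_counts[rule[0]] for rule, c in rule_counts.items()}

# Toy usage: two tiny derivations sharing the start symbol "S".
trees = [
    [("S", ("NP", "VP")), ("NP", ("she",)), ("VP", ("runs",))],
    [("S", ("VP",)), ("VP", ("run",))],
]
print(erm_pcfg(trees))  # e.g. P(S -> NP VP) = 0.5, P(S -> VP) = 0.5
```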
Similar papers
Computational Learning of Probabilistic Grammars in the Unsupervised Setting
With the rising amount of available multilingual text data, computational linguistics faces an opportunity and a challenge. This text can enrich the domains of NLP applications and improve their performance. Traditional supervised learning for this kind of data would require annotation of part of this text for induction of natural language structure. For these large amounts of rich text, such a...
Empirical Risk Minimization with Approximations of Probabilistic Grammars
When approximating a family of probabilistic grammars, it is convenient to assume the degree of the grammar is limited. We limit the degree of the grammar by making the assumption that N_k ≤ 2. This assumption may seem, at first glance, somewhat restrictive, but we show next that for probabilistic context-free grammars (and as a consequence, other formalisms), this assumption does not restrict g...
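As a rough illustration of why bounding the degree need not be restrictive, the sketch below shows one standard way to transform a grammar so that every nonterminal carries at most two alternative rules: a nonterminal with k alternatives is replaced by a chain of fresh nonterminals, each keeping one alternative and deferring the rest. This is an assumed reconstruction of the general idea, not the paper's construction; in the probabilistic case, the unit rule to each fresh symbol would carry the residual probability mass.

```python
# Hypothetical sketch: rewrite a grammar so each nonterminal has <= 2 rules,
# by chaining a nonterminal's alternatives through fresh symbols.
# Symbol names like "VP~0" are illustrative, not the paper's notation.

def limit_degree(grammar):
    """grammar: dict lhs -> list of rhs tuples (alternative rules).
    Returns an equivalent grammar in which every lhs has <= 2 rules."""
    out = {}
    for lhs, alts in grammar.items():
        current, remaining, i = lhs, list(alts), 0
        while len(remaining) > 2:
            fresh = f"{lhs}~{i}"                     # fresh chain nonterminal
            out[current] = [remaining[0], (fresh,)]  # keep one alt, defer rest
            current, remaining, i = fresh, remaining[1:], i + 1
        out[current] = remaining
    return out

g = {"VP": [("V",), ("V", "NP"), ("V", "NP", "PP"), ("VP", "PP")]}
print(limit_degree(g))
# {'VP': [('V',), ('VP~0',)], 'VP~0': [('V', 'NP'), ('VP~1',)],
#  'VP~1': [('V', 'NP', 'PP'), ('VP', 'PP')]}
```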
On the Fine-Grained Complexity of Empirical Risk Minimization: Kernel Methods and Neural Networks
Empirical risk minimization (ERM) is ubiquitous in machine learning and underlies most supervised learning methods. While there has been a large body of work on algorithms for various ERM problems, the exact computational complexity of ERM is still not understood. We address this issue for multiple popular ERM problems including kernel SVMs, kernel ridge regression, and training the final layer...
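For intuition about why the exact complexity of such ERM problems matters, here is an illustrative sketch (not from the paper) of one of the problems named above, kernel ridge regression: the ERM solution is available in closed form, but it requires the n × n Gram matrix, so even this direct path costs time quadratic in the sample size, which is exactly the regime fine-grained complexity results probe. Function names and parameters are assumptions for illustration.

```python
# Illustrative sketch: closed-form ERM for kernel ridge regression with a
# Gaussian kernel. Building the n x n Gram matrix already takes O(n^2 d) time.
import numpy as np

def kernel_ridge_fit(X, y, lam=1e-2, gamma=1.0):
    """Solve min_f sum_i (f(x_i) - y_i)^2 + lam * ||f||^2 in the RKHS of a
    Gaussian kernel; returns dual coefficients alpha and the Gram matrix."""
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K = np.exp(-gamma * sq)                                # n x n Gram matrix
    alpha = np.linalg.solve(K + lam * np.eye(len(X)), y)   # dual solution
    return alpha, K

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = np.sin(X[:, 0])
alpha, K = kernel_ridge_fit(X, y)
print("train MSE:", np.mean((K @ alpha - y) ** 2))
```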
Rademacher penalties and structural risk minimization
We suggest a penalty function to be used in various problems of structural risk minimization. This penalty is data-dependent and is based on the sup-norm of the so-called Rademacher process indexed by the underlying class of functions (sets). The standard complexity penalties, used in learning problems and based on the VC dimensions of the classes, are conservative upper bounds (in a probabilist...
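A hedged sketch of the idea: the penalty can be estimated by Monte Carlo, drawing random signs σ_i and taking the supremum over the class of |(1/n) Σ_i σ_i f(x_i)|. The finite class of threshold functions below is an illustrative assumption, chosen only so the supremum is computable exactly.

```python
# Illustrative sketch: Monte Carlo estimate of a data-dependent Rademacher
# penalty, sup_f |(1/n) sum_i sigma_i f(x_i)|, over a small finite class of
# threshold functions f_t(x) = 1{x <= t}. Class and names are assumptions.
import numpy as np

def rademacher_penalty(sample, thresholds, n_draws=2000, seed=0):
    rng = np.random.default_rng(seed)
    n = len(sample)
    # One column of {0, 1} predictions per threshold: shape n x |class|.
    F = (sample[:, None] <= thresholds[None, :]).astype(float)
    sups = []
    for _ in range(n_draws):
        sigma = rng.choice([-1.0, 1.0], size=n)      # Rademacher signs
        sups.append(np.max(np.abs(sigma @ F)) / n)   # sup-norm over the class
    return np.mean(sups)

x = np.random.default_rng(1).uniform(size=200)
print(rademacher_penalty(x, thresholds=np.linspace(0, 1, 25)))
```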
A Sample Complexity Measure with Applications to Learning Optimal Auctions
We introduce a new sample complexity measure, which we refer to as the split-sample growth rate. For any hypothesis space H and any sample S of size m, the split-sample growth rate τ̂_H(m) counts how many different hypotheses empirical risk minimization can output on any sub-sample of S of size m/2. We show that the expected generalization error is upper bounded by O(√...
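The following toy sketch illustrates the definition: for one fixed sample, it enumerates sub-samples of size m/2, runs ERM on each over a small finite class of threshold rules, and counts the distinct hypotheses returned. The class, the loss, and all names are illustrative assumptions, not the paper's setup.

```python
# Illustrative sketch of the split-sample growth rate idea: count how many
# distinct hypotheses ERM can return across size-m/2 sub-samples of a fixed
# sample, using a tiny finite class of threshold classifiers 1{x <= t}.
import numpy as np
from itertools import combinations

def split_sample_growth(xs, ys, thresholds, max_subsamples=500, seed=0):
    rng = np.random.default_rng(seed)
    m = len(xs)
    idx_pool = list(combinations(range(m), m // 2))
    if len(idx_pool) > max_subsamples:       # subsample pools for tractability
        keep = rng.choice(len(idx_pool), max_subsamples, replace=False)
        idx_pool = [idx_pool[i] for i in keep]
    erm_outputs = set()
    for idx in idx_pool:
        sub_x, sub_y = xs[list(idx)], ys[list(idx)]
        errors = [np.mean((sub_x <= t) != sub_y) for t in thresholds]
        erm_outputs.add(thresholds[int(np.argmin(errors))])  # ERM's pick
    return len(erm_outputs)

rng = np.random.default_rng(2)
xs = rng.uniform(size=10)
ys = xs <= 0.4
print(split_sample_growth(xs, ys, np.linspace(0, 1, 11)))
```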
Journal: Computational Linguistics
Volume: 38, Issue: -
Pages: -
Year: 2012